Crosslinguistic Transfer in Automatic Verb Classification
Authors
Abstract
We investigate the use of multilingual data in the automatic classification of English verbs, and show that there is a useful transfer of information across languages. Specifically, we experiment with three lexical semantic classes of English verbs. We collect statistical features over a sample of English verbs from each of the classes, as well as over Chinese translations of those verbs. We use the English and Chinese data, alone and in combination, as training data for a machine learning algorithm whose output is an automatic verb classifier. We demonstrate that Chinese data is indeed useful in helping to classify the English verbs (at 82% accuracy), and furthermore that a multilingual combination of data outperforms the English data alone (85% accuracy). Moreover, our results using monolingual corpora show that it is not necessary to use a parallel corpus to extract the translations in order for this technique to be successful.
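The training setup described in the abstract (features from English verbs, features from their Chinese translations, and the two combined) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not name the learning algorithm or the features, so a simple nearest-centroid classifier stands in for the learner, the verb and class names are placeholders, and the feature values are invented numbers, not data from the paper.

```python
from collections import defaultdict
from math import dist

# Invented placeholder features (NOT from the paper): one vector per verb.
english = {"v1": [0.9, 0.1], "v2": [0.8, 0.2], "v3": [0.2, 0.9],
           "v4": [0.1, 0.8], "v5": [0.5, 0.5], "v6": [0.6, 0.4]}
chinese = {"v1": [0.7], "v2": [0.6], "v3": [0.1],
           "v4": [0.2], "v5": [0.9], "v6": [1.0]}
labels = {"v1": "class_A", "v2": "class_A", "v3": "class_B",
          "v4": "class_B", "v5": "class_C", "v6": "class_C"}

def combine(english_feats, chinese_feats):
    """Concatenate the English and Chinese feature vectors of each verb."""
    return {verb: english_feats[verb] + chinese_feats[verb]
            for verb in english_feats}

def train_centroids(features, verb_labels):
    """Nearest-centroid training: average the vectors of each class."""
    grouped = defaultdict(list)
    for verb, vec in features.items():
        grouped[verb_labels[verb]].append(vec)
    return {cls: [sum(col) / len(vecs) for col in zip(*vecs)]
            for cls, vecs in grouped.items()}

def classify(vec, centroids):
    """Return the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda cls: dist(vec, centroids[cls]))

# Train on the multilingual combination and classify a new combined vector.
combined = combine(english, chinese)
centroids = train_centroids(combined, labels)
predicted = classify([0.85, 0.2, 0.6], centroids)
```

Passing `english` or `chinese` alone to `train_centroids` gives the two monolingual conditions, so the three settings compared in the abstract differ only in which feature dictionary is used for training.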
Similar resources
Proceedings of the 19th COLING, 1023-1029, 2002. Crosslinguistic Transfer in Automatic Verb Classification Vivian
We investigate the use of multilingual data in the automatic classification of English verbs, and show that there is a useful transfer of information across languages. Specifically, we experiment with three lexical semantic classes of English verbs. We collect statistical features over a sample of English verbs from each of the classes, as well as over Chinese translations of those verbs. We use...
Full text
Automatic verb classification using multilingual resources
We propose the use of multilingual corpora in the automatic classification of verbs. We extend the work of (Merlo and Stevenson, 2001), in which statistics over simple syntactic features extracted from textual corpora were used to train an automatic classifier for three lexical semantic classes of English verbs. We hypothesize that some lexical semantic features that are difficult to detect superfic...
Full text
Automatic Verb Classification Using Multilingual Resources
We propose the use of multilingual corpora in the automatic classification of verbs. We extend the work of (Merlo and Stevenson, 2001), in which statistics over simple syntactic features extracted from textual corpora were used to train an automatic classifier for three lexical semantic classes of English verbs. We hypothesize that some lexical semantic features that are difficult to detect superfic...
Full text
Automatic Lexical Acquisition Based on Statistical Distributions
We automatically classify verbs into lexical semantic classes, based on distributions of indicators of verb alternations, extracted from a very large annotated corpus. We address a problem which is particularly difficult because the verb classes, although semantically different, show similar surface syntactic behavior. Five grammatical features are sufficient to reduce error rate by more than 50% ov...
Full text